Efficient Processing of XML Path Queries Using the Disk-based F&B Index
Authors
Abstract
With the proliferation of XML data and applications on the Internet, efficient XML query processing techniques are in great demand. Answering queries using XML indexes is a natural approach. A number of XML indexes have been proposed in the literature; among them, the F&B Index is a powerful one, as it is the smallest index that answers all twig queries. However, an F&B Index suffers from two problems: (1) it was originally proposed as a memory-based index, while its size is usually large in practice, and (2) answering queries using an F&B Index is not fully optimized. These problems limit the benefits, and even the applicability, of F&B Indexes in practice. In this paper, we propose a highly optimized disk organization method for an F&B Index; the result is a disk-based F&B Index with good clustering properties. In addition, novel query processing algorithms exploiting the physical organization of the disk-based F&B Index are proposed. Experimental results verify that our disk-based F&B Index scales to large data sizes with good query performance compared with state-of-the-art XML query processing algorithms.
∗ This work was partially supported by UNSW FPG Grant (PS06863), UNSW Goldstar Grant (PS07248), and ARC Discovery Grant (DP0346004).
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005
Similar resources
Practical Indexing XML Document for Twig Query
Answering structural queries over XML with an index is an important approach to efficient XML query processing. Among existing structural indexes for XML data, the F&B index is the smallest index that can answer all branching queries. However, an F&B index for less regular XML data often contains a large number of index nodes, and hence requires a large amount of main memory. If the F&B index cannot be accommoda...
Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML
As we move into the big data world, where unstructured data grows at high velocity, a data store is needed that can hold this volume of data. Relational databases, the traditional choice, are not suited to handling data at this scale, so a move to non-relational data stores is needed. In the current study, we have proposed an extension of the Mongo...
AB-Index: An Efficient Adaptive Index for Branching XML Queries
Query-adaptive XML indexing has been proposed and shown to be an efficient way to accelerate XML query processing, because it dynamically adapts to the workload. However, existing adaptive indexes lack support for branching queries and are inefficient in both query processing and adaptation operations. In this paper, we propose a new Adaptive index for Branching queries, which is named a...
Efficient Processing of Expressive Node-Selecting Queries on XML Data in Secondary Storage: A Tree Automata-based Approach
We propose a new, highly scalable and efficient technique for evaluating node-selecting queries on XML trees which is based on recent advances in the theory of tree automata. Our query processing techniques require only two linear passes over the XML data on disk, and their main memory requirements are in principle independent of the size of the data. The overall running time is O(m + n), where...
Publication date: 2005